AMENDMENTS TO THE CLAIMS



1. (Currently amended) A computer-implemented method for selectively accessing
a document during a current crawl of a server computer, the document being identified by a
document address specification, the document having been retrieved during a previous crawl, the
method comprising:

determining whether to access the document during the current crawl with the aid of a
[[statistical]] probabilistic model that is based on the probability that the document has changed
since the previous crawl; and

accessing the document if the determination produces an instruction indicative that the
document at the document address specification should be accessed during the current crawl.

2. (Currently amended) The method of Claim 1, wherein determining whether to
access the document [[further]] with the aid of a probabilistic model comprises computing a
probability that the document has changed since the document was retrieved during the previous
crawl.

3. (Currently amended) The method of Claim 2, wherein computing the probability
that [[the]] a document has changed [[further]] comprises:

selecting an active probability indicative of a proportion of documents in a plurality of
documents that are changing at various change rates, the plurality of documents including the
document;

training the active probability to reflect [[an]] experience with the document during a
plurality of previous crawls; and

using the trained active probability to compute the probability that the document has 
changed. 

4. (Original) The method of Claim 3, further comprising: 

selecting the probability that the document has changed from the previous crawl as the 
active probability in the current crawl; and 

repeating the method of Claim 3 for the current crawl.

5. (Currently amended) The method of Claim 3, wherein training the active
probability includes multiplying the active probability indicative of a change in the document by
a training probability calculated using a [[statistical]] probabilistic model.

6. (Currently amended) The method of Claim 1, wherein the [[statistical]] probabilistic
model further comprises:

training a document probability distribution corresponding to the document address
specification to reflect [[an]] experience with the document during a plurality of previous crawls,
the document probability distribution including a plurality of probabilities;

determining from the document probability distribution a probability that the document
has changed; and

making a determination of whether to access the document in a current crawl based on 
the probability that the document has changed. 

7. (Currently amended) The method of Claim 6, further comprising:

calculating, based on the experience with the document during a plurality of previous
crawls, a discrete random variable distribution that includes a plurality of training probabilities;
and

multiplying each probability in the document probability distribution by a corresponding
training probability from the discrete random variable distribution.

8. (Original) The method of Claim 7, wherein the training probabilities are
calculated using a Poisson process, the Poisson process including a Poisson equation (e^(-r*dt))
and a complementary Poisson equation (1-e^(-r*dt)).
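
The training step recited in Claims 6 through 8 can be illustrated with a minimal Python sketch. It is
not the applicant's implementation. The function names, the discrete list of candidate change rates,
and the normalization step are assumptions introduced for illustration only; the claims recite only
the multiplication by training probabilities and the two Poisson equations.

```python
import math


def train_distribution(probs, rates, dt, changed):
    """Multiply each probability by its corresponding training probability.

    probs   -- document probability distribution, one entry per candidate rate
    rates   -- hypothetical candidate change rates r (changes per unit time)
    dt      -- time elapsed between the two crawls being compared
    changed -- True if the document was observed to have changed
    """
    trained = []
    for p, r in zip(probs, rates):
        unchanged = math.exp(-r * dt)  # Poisson equation e^(-r*dt)
        # Complementary Poisson equation 1 - e^(-r*dt) when a change was seen.
        training = (1.0 - unchanged) if changed else unchanged
        trained.append(p * training)
    total = sum(trained)
    # Normalizing so the entries sum to 1 is an assumption of this sketch.
    return [t / total for t in trained] if total > 0 else list(probs)


def probability_changed(probs, rates, dt):
    """Probability that the document has changed after an interval dt."""
    return sum(p * (1.0 - math.exp(-r * dt)) for p, r in zip(probs, rates))
```

Under these assumptions, an observed change shifts weight toward the faster candidate rates, and an
observed lack of change shifts weight toward the slower ones.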

9. (Original) The method of Claim 8, wherein the experience with the document
during the plurality of previous crawls is derived from historical information associated with the
document address specification.

10. (Currently amended) A computer-readable medium having computer-executable
instructions for retrieving one document in a plurality of documents from a remote server, which
when executed comprise:

maintaining historical information associated with changes to the one document [[at the
remote server]];

initiating a crawl procedure for retrieving particular documents in the plurality of 
documents; and 

determining whether to access the one document from the remote server based on [[an]] a
probabilistic analysis of the historical information associated with the changes to the one
document [[at the remote server]].

11. (Original) The computer-readable medium of Claim 10, further comprising:

if the determination to access the one document is positive, identifying the one document
for retrieval during the crawl procedure; and

attempting to retrieve all documents identified for retrieval during the crawl procedure. 

12. (Currently amended) The computer-readable medium of Claim 10, wherein
[[determining whether to retrieve the document]] the probabilistic analysis comprises:

computing a probability that the one document has changed since the one document was 
last retrieved from the remote server. 

13. (Original) The computer-readable medium of Claim 12, wherein computing the 
probability that the one document has changed further comprises: 

beginning with a probability that a pre-defined proportion of documents in the plurality
of documents has changed, training the probability that the pre-defined proportion of documents
has changed using the historical information associated with the one document to achieve the
probability that the one document has changed.

14. (Original) The computer-readable medium of Claim 12, further comprising
making a random decision to retrieve the one document wherein the random decision is biased
by the probability that the one document has changed.

15. (Original) The computer-readable medium of Claim 14, wherein the random 
decision is further biased by a synchronization level configured to influence the random decision 
based on a predetermined degree of tolerance for not retrieving the one document if the 
document is likely to have changed. 

16. (Original) The computer-readable medium of Claim 14, wherein the random 
decision is made by a software routine adapted to simulate a flip of a coin. 
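
The biased random decision of Claims 14 through 16 can be sketched in Python as a weighted coin flip.
This is an illustration under stated assumptions, not the claimed routine: the function name
`should_retrieve` is hypothetical, and treating the synchronization level as a simple multiplier on
the change probability is an assumption, since the claims do not say how the level enters the
decision.

```python
import random


def should_retrieve(p_changed, sync_level=1.0, rng=None):
    """Simulate a biased coin flip deciding whether to retrieve a document.

    p_changed  -- probability that the document has changed
    sync_level -- hypothetical multiplier; values above 1 lower the tolerance
                  for skipping a document that is likely to have changed
    rng        -- optional random.Random instance, useful for repeatable runs
    """
    if rng is None:
        rng = random.Random()
    # Clamp the bias into [0, 1] so it remains a valid probability.
    bias = max(0.0, min(1.0, p_changed * sync_level))
    # rng.random() is uniform on [0, 1), so this is True with probability bias.
    return rng.random() < bias
```

With this formulation a change probability of 0 never triggers retrieval and a probability of 1
always does, while intermediate values trigger retrieval in proportion to the bias.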

17. (Currently amended) The computer-readable medium of Claim 10, wherein:

the historical information associated with changes to the one document includes a time
stamp for the one document, the time stamp being indicative of [[a last]] a time that the one
document was last modified when the one document was last retrieved from the remote server;
and 

wherein the probabilistic analysis includes a comparison of the time stamp included in
the historical information with another time stamp associated with the one document stored on
the remote server.

18. (Original) The computer-readable medium of Claim 17, further comprising:

if the time stamp included in the historical information does not match the other time
stamp associated with the one document stored on the remote server, identifying the one
document for retrieval during the crawl procedure.

19. (Currently amended) The computer-readable medium of Claim 10, wherein:

the historical information associated with changes to the one document includes a hash
value associated with the one document, the hash value being a representation of the one
document; and

wherein the probabilistic analysis includes a comparison of the hash value included in the 
historical information with another hash value calculated from information retrieved from the 
one document stored on the remote server. 

20. (Original) The computer-readable medium of Claim 19, if the hash value
included in the historical information does not match the other hash value associated with the
one document stored on the remote server, identifying the one document for retrieval during the
crawl procedure.
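
The time stamp and hash value comparisons of Claims 17 through 20 can be sketched in Python as
follows. This is a hedged illustration only: the `history` dictionary layout, the function names, and
the choice of SHA-256 are assumptions, since the claims do not name a hash function or a storage
format.

```python
import hashlib


def document_hash(content: bytes) -> str:
    """Return a hash value representing the document (SHA-256 is assumed)."""
    return hashlib.sha256(content).hexdigest()


def identify_for_retrieval(history, remote_timestamp=None, remote_content=None):
    """Return True if the stored history does not match the remote document.

    history          -- hypothetical dict with optional "timestamp" and "hash"
    remote_timestamp -- time stamp reported for the document on the server
    remote_content   -- bytes retrieved from the document on the server
    """
    # Time stamp comparison: a mismatch identifies the document for retrieval.
    if remote_timestamp is not None and history.get("timestamp") != remote_timestamp:
        return True
    # Hash comparison: a mismatch identifies the document for retrieval.
    if remote_content is not None and history.get("hash") != document_hash(remote_content):
        return True
    return False
```

A caller would record the time stamp and hash value after each successful retrieval so that the next
crawl compares against the most recent history.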